Geometry of Polysemy

Authors

  • Jiaqi Mu
  • Suma Bhat
  • Pramod Viswanath
Abstract

Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec. However, a single vector does not suffice to model the polysemous nature of many (frequent) words, i.e., words with multiple meanings. In this paper, we propose a three-fold approach for unsupervised polysemy modeling: (a) context representations, (b) sense induction and disambiguation, and (c) lexeme (a word and sense pair) representations. A key feature of our work is the finding that a sentence containing a target word is well represented by a low-rank subspace, instead of a point in a vector space. We then show that the subspaces associated with a particular sense of the target word tend to intersect over a line (a one-dimensional subspace), which we use to disambiguate senses using a clustering algorithm that harnesses the Grassmannian geometry of the representations. The disambiguation algorithm, which we call K-Grassmeans, leads to a procedure to label the different senses of the target word in the corpus, yielding lexeme vector representations, all in an unsupervised manner starting from a large (Wikipedia) corpus in English. Apart from several prototypical target (word, sense) examples and a host of empirical studies to intuit and justify the various geometric representations, we validate our algorithms on standard sense induction and disambiguation datasets and present new state-of-the-art results.
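The geometric idea in the abstract can be illustrated with a minimal NumPy sketch. It is one plausible reading of the description, not the authors' implementation: the function names, the uncentered SVD used to obtain a context subspace, the rank, the random initialization, and the distance `sqrt(1 - ||B^T u||^2)` between a unit direction and a subspace are all assumptions made for illustration.

```python
import numpy as np

def context_subspace(vectors, rank):
    """Orthonormal basis (d x rank) for a context, given its word vectors as rows.
    Assumption: uncentered SVD, so the subspace passes through the origin."""
    _, _, vt = np.linalg.svd(vectors, full_matrices=False)
    return vt[:rank].T

def intersection_direction(subspaces):
    """Unit vector closest to all subspaces: the top eigenvector of the
    sum of their projection matrices B @ B.T."""
    d = subspaces[0].shape[0]
    m = np.zeros((d, d))
    for basis in subspaces:
        m += basis @ basis.T
    _, eigvecs = np.linalg.eigh(m)  # eigenvalues in ascending order
    return eigvecs[:, -1]

def distance_to_subspace(u, basis):
    """Distance from unit vector u to the subspace spanned by basis columns."""
    proj = basis.T @ u
    return float(np.sqrt(max(0.0, 1.0 - proj @ proj)))

def k_grassmeans(subspaces, k, iters=20, seed=0):
    """Hypothetical k-means-style loop: alternate between fitting one
    direction per cluster and reassigning each subspace to its nearest direction."""
    rng = np.random.default_rng(seed)
    d = subspaces[0].shape[0]
    labels = rng.integers(0, k, size=len(subspaces))
    dirs = np.zeros((k, d))
    for _ in range(iters):
        for j in range(k):
            members = [s for s, lab in zip(subspaces, labels) if lab == j]
            if members:
                dirs[j] = intersection_direction(members)
            else:
                # Empty cluster: fall back to a random unit direction.
                v = rng.normal(size=d)
                dirs[j] = v / np.linalg.norm(v)
        new_labels = np.array([
            np.argmin([distance_to_subspace(u, s) for u in dirs])
            for s in subspaces
        ])
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, dirs
```

In this sketch each sentence containing the target word would contribute one basis from `context_subspace`, and the label returned by `k_grassmeans` plays the role of the induced sense. Like ordinary k-means, the loop can settle in a local optimum, so several random restarts would be a reasonable precaution.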

Similar resources

Allegory: Structure, Interpretation and Polysemy

In allegories polysemy relates not only to the context and the audience’s understanding but also to the structural characters of these texts. This paper investigates the function of structural and narrative properties in the creation of multiple interpretations of an allegory. Focusing on the events and following a unique story-line is the most important trait in helping to read the alleg...

The polysemy of the words that children learn over time

Here we study polysemy as a potential learning bias in vocabulary learning in children. We employ a massive set of transcriptions of conversations between children and adults in English, to analyze the evolution of mean polysemy in the words produced by children whose ages range between 10 and 60 months. Our results show that mean polysemy in children increases over time in two phases, i.e. a f...

A pragmatic solution to the polysemy paradox*

This paper investigates the phenomenon of polysemy: a single word with two or multiple related senses (e.g. run in run a half marathon, run on gasoline, run a shop, etc.). While it is largely unproblematic from the point of view of communication, polysemy poses a range of theoretical and descriptive problems. This has been described as the polysemy paradox (Ravin & Leacock 2000). In this paper,...

Effects of Polysemy in Lexical Decision and Naming: An Alternative to Lexical Access Accounts

The effects of polysemy (number of meanings) and word frequency were examined in lexical decision and naming tasks. Polysemy effects were observed in both tasks. In the lexical decision task, high- and low-frequency words produced identical polysemy effects. In the naming task, however, polysemy interacted with frequency, with polysemy effects being limited to low-frequency words. When degraded s...

A Cognitive Account of the Lexical Polysemy of Chinese Kai

Graduate Institute of English, National Taiwan Normal University. Since polysemy has multiple but related senses, finding any coherent system would seem impossible. But its senses are not random. When we look at inferences among them, it becomes clear that there must be a systematic structure of some kind. Based on the prototype theory, which views lexical items as constituting natural ...

Automatic Biomedical Term Polysemy Detection

Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which makes it possible to find different meanings for a term. Polysemy detection is also important for information extraction (IE) systems. In addition, polysemy detection is important for building/enriching terminologies and ontologies. In this paper, we present a nov...

Journal:
  • CoRR

Volume: abs/1610.07569

Publication date: 2016